Hands-on Exercise 3B: Programming Animated Statistical Graphics with R

Author

Rachel Yee

Published

January 23, 2024

Modified

January 25, 2024

Overview

In this exercise, we will explore the dynamic realm of animated data visualization using the gganimate and plotly R packages. Along the way, we will also pick up on:

  • reshaping data through the tidyr package, as well as

  • mastering the art of processing, wrangling, and transforming data with the dplyr package.

The focus is on creating engaging visual narratives that captivate audiences through animated graphics, leaving a deeper impression compared to static visuals.

Basic concepts of animation

To create animations, we construct numerous individual plots, each representing a specific moment, and then assemble them into a sequence, much like creating a flip book or cartoon. These separate frames, each built from a specific part of the overall data, come together to create the illusion of motion when played in succession.
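The flip-book idea can be sketched in base R by splitting a data set into one subset per time point. The data frame below is a hypothetical toy example (its values are not taken from the exercise data); each element of the resulting list corresponds to the data behind one frame.

```r
# Toy data (hypothetical values): two countries observed in three years
pop <- data.frame(
  Country = rep(c("A", "B"), times = 3),
  Year    = rep(c(2000L, 2010L, 2020L), each = 2),
  Old     = c(10, 12, 11, 14, 13, 17),
  Young   = c(40, 35, 37, 31, 33, 28)
)

# One data subset per frame: split the rows by Year
frames <- split(pop, pop$Year)

# Each list element holds the rows that would be drawn in one frame
names(frames)
frames[["2010"]]
```

An animation package does this splitting for us and also interpolates between frames, which is what makes the motion look smooth.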

Terminology

Key concepts and terminology:

  1. Frame: In an animated line graph, each frame represents a different point in time or a different category. When the frame changes, the data points on the graph are updated to reflect the new data.

  2. Animation Attributes: The animation attributes are the settings that control how the animation behaves. For example, you can specify the duration of each frame, the easing function used to transition between frames, and whether to start the animation from the current frame or from the beginning.
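As one possible illustration of animation attributes, the sketch below sets frame duration, transition duration, and easing with animation_opts() from the plotly package. The data frame df and its values are hypothetical, and the timings are arbitrary choices rather than recommended settings.

```r
library(plotly)

# Toy data (hypothetical values): three points observed at two time steps
df <- data.frame(
  x    = c(1, 2, 3, 1, 2, 3),
  y    = c(1, 2, 3, 3, 2, 1),
  step = rep(1:2, each = 3)
)

p <- plot_ly(df, x = ~x, y = ~y, frame = ~step,
             type = "scatter", mode = "markers") %>%
  animation_opts(
    frame      = 1000,           # milliseconds between frames (includes the transition)
    transition = 500,            # milliseconds spent moving between frames
    easing     = "cubic-in-out", # easing function applied to the transition
    redraw     = FALSE           # skip a full redraw of the plot after each transition
  )

# Print p in an interactive session to view the animation
```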

Tip

Before you start making animated graphs, you should first ask yourself: Does it make sense to go through the effort? If you are conducting an exploratory data analysis, an animated graphic may not be worth the time investment. However, if you are giving a presentation, a few well-placed animated graphics can help an audience connect with your topic far better than their static counterparts.

Getting Started

Installing and importing R packages

In this exercise, the following R packages will be used:

  • plotly for plotting interactive statistical graphs

  • gganimate, a ggplot2 extension for creating animated statistical graphs

  • gifski, which converts video frames to GIF animations using pngquant’s fancy features for efficient cross-frame palettes and temporal dithering. It produces animated GIFs that use thousands of colors per frame.

  • gapminder, an excerpt of the data available at Gapminder.org. We just want to use its country_colors scheme.

  • tidyverse, a family of modern R packages specially designed to support data science, analysis, and communication tasks, including creating static statistical graphs

  • readxl for importing Excel worksheets

pacman::p_load(readxl, gifski, gapminder, plotly, gganimate, tidyverse)
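p_load() installs any package that is missing and then loads it. To see which packages it would still need to install, a check along the following lines can be run first. This is a minimal sketch that only inspects the library; it does not install or attach anything.

```r
# Packages loaded by p_load() in this exercise
pkgs <- c("readxl", "gifski", "gapminder", "plotly", "gganimate", "tidyverse")

# requireNamespace() returns FALSE rather than raising an error
# when a package is not installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

# Report what would still need to be installed
if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are installed.")
}
```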

Importing the Data

In this hands-on exercise, the Data worksheet from the GlobalPopulation Excel workbook will be used.

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate_each_(funs(factor(.)), col) %>%
  mutate(Year = as.integer(Year))
Things to learn from the code chunk above
  • read_xls() of the readxl package is used to import the Excel worksheet.

  • mutate_each_() of the dplyr package is used to convert the columns named in col (Country and Continent) into factors.

  • mutate() of the dplyr package is used to convert the values of the Year field into integers.

Unfortunately, both mutate_each_() and funs() have been deprecated. In view of this, we will rewrite the code using mutate_at(), as shown in the code chunk below.

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate_at(col, as.factor) %>%
  mutate(Year = as.integer(Year))

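Instead of using mutate_at(), which dplyr has superseded, across() can be used to derive the same output. Wrapping col in all_of() tells across() that col is an external character vector of column names.

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate(across(all_of(col), as.factor)) %>%
  mutate(Year = as.integer(Year))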
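To check what the conversion does without needing the Excel workbook, the same steps can be tried on a small stand-in table. The tibble toy below and its values are hypothetical; only the column names follow the exercise data.

```r
library(dplyr)

# Toy stand-in for the imported worksheet (hypothetical values)
toy <- tibble(
  Country    = c("A", "B", "A", "B"),
  Continent  = c("Asia", "Europe", "Asia", "Europe"),
  Year       = c(2000, 2000, 2010, 2010),   # stored as double, like a numeric Excel column
  Population = c(100, 250, 120, 260)
)

col <- c("Country", "Continent")
toy_clean <- toy %>%
  mutate(across(all_of(col), as.factor)) %>%  # character columns become factors
  mutate(Year = as.integer(Year))             # double becomes integer

# Inspect the resulting column types
sapply(toy_clean, class)
```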

Animated Data Visualisation: gganimate methods

gganimate extends the grammar of graphics as implemented by ggplot2 to include the description of animation. It does this by providing a range of new grammar classes that can be added to the plot object in order to customise how it should change with time.

  • transition_*() defines how the data should be spread out and how it relates to itself across time.

  • view_*() defines how the positional scales should change along the animation.

  • shadow_*() defines how data from other points in time should be presented in the given point in time.

  • enter_*()/exit_*() defines how new data should appear and how old data should disappear during the course of the animation.

  • ease_aes() defines how different aesthetics should be eased during transitions.
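The sketch below shows how one member of each grammar class can be added to a plot. It uses the gapminder data set that ships with the gapminder package (not the GlobalPopulation workbook), and the specific choices (view_follow(), shadow_wake(), the fade functions, and the easing name) are illustrative assumptions rather than settings used elsewhere in this exercise.

```r
library(ggplot2)
library(gganimate)
library(gapminder)

anim <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp,
                              size = pop, colour = country)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  labs(title = "Year: {frame_time}") +
  transition_time(year) +                          # transition_*(): move through the years
  view_follow(fixed_y = TRUE) +                    # view_*(): x-axis follows the data, y-axis stays fixed
  shadow_wake(wake_length = 0.1, alpha = FALSE) +  # shadow_*(): short trail of earlier positions
  enter_fade() + exit_fade() +                     # enter_*()/exit_*(): no visible effect here, since
                                                   # every country appears in every year
  ease_aes("cubic-in-out")                         # ease_aes(): easing applied to all aesthetics

# Printing anim (or calling animate(anim)) renders the animation
```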

Building a static population bubble plot

ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young') 

Building the animated bubble plot

In the code chunk below,

  • transition_time() of gganimate is used to transition through distinct states in time (i.e. Year).

  • ease_aes() is used to control the easing of aesthetics. The default is linear. Other methods include quadratic, cubic, quartic, quintic, sine, circular, exponential, elastic, back, and bounce; each of these can take an -in, -out, or -in-out modifier (for example, 'cubic-in-out').

ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young') +
  transition_time(Year) +       
  ease_aes('linear')          
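Printing a gganimate object renders it with default settings. To control the number of frames, the frame rate, and the output file, animate() and anim_save() can be used, as in the sketch below. It assumes the gapminder data set instead of the GlobalPopulation workbook, and the country subset, frame count, image size, and file name are arbitrary choices made to keep rendering quick.

```r
library(ggplot2)
library(gganimate)
library(gapminder)

# Keep a few countries so the sketch renders quickly (arbitrary selection)
small <- subset(gapminder,
                country %in% c("Singapore", "Japan", "Nigeria", "Brazil"))

anim <- ggplot(small, aes(x = gdpPercap, y = lifeExp,
                          size = pop, colour = country)) +
  geom_point(alpha = 0.7) +
  labs(title = "Year: {frame_time}") +
  transition_time(year) +
  ease_aes("cubic-in-out")   # slow start and slow end for every transition

# Render 24 frames at 8 frames per second with the gifski renderer
rendered <- animate(anim, nframes = 24, fps = 8,
                    width = 480, height = 360,
                    renderer = gifski_renderer())

# Save the rendered GIF to a temporary location
out_file <- file.path(tempdir(), "bubble.gif")
anim_save(out_file, animation = rendered)
```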

Animated Data Visualisation: plotly

In the plotly R package, both ggplotly() and plot_ly() support key frame animations through the frame argument/aesthetic. They also support an ids argument/aesthetic to ensure smooth transitions between objects with the same id (which helps facilitate object constancy).
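A minimal sketch of the frame and ids arguments is shown below, using a hypothetical toy data frame. The rows of the second step are deliberately listed in a different order, so that ids, rather than row position, determines which point moves where.

```r
library(plotly)

# Toy data (hypothetical values); step 2 lists the items in a different order
df <- data.frame(
  item = c("a", "b", "c", "c", "a", "b"),
  x    = c(1, 2, 3, 3.5, 1.5, 2.5),
  y    = c(1, 2, 3, 2, 3, 1),
  step = c(1, 1, 1, 2, 2, 2)
)

p <- plot_ly(df, x = ~x, y = ~y,
             frame = ~step,   # one key frame per value of step
             ids   = ~item,   # match points across frames by item, not by row order
             text  = ~item, hoverinfo = "text",
             type = "scatter", mode = "markers")

# Print p in an interactive session to view the animation
```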

Building an animated bubble plot: ggplotly() method

The animated bubble plot produced by the code chunk below includes a play/pause button and a slider component for controlling the animation.

gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young')

ggplotly(gg)
Things to learn from the code chunk above
  • Appropriate ggplot2 functions are used to create a static bubble plot. The output is then saved as an R object called gg.

  • ggplotly() is then used to convert the R graphic object into an animated svg object.

Notice that although the show.legend = FALSE argument was used, the legend still appears on the plot. To overcome this problem, theme(legend.position = 'none') should be used, as shown in the plot and code chunk below.

gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young') + 
  theme(legend.position='none')

ggplotly(gg)

Building an animated bubble plot: plot_ly() method

bp <- globalPop %>%
  plot_ly(x = ~Old, 
          y = ~Young, 
          size = ~Population, 
          color = ~Continent,
          sizes = c(2, 100),
          frame = ~Year, 
          text = ~Country, 
          hoverinfo = "text",
          type = 'scatter',
          mode = 'markers'
          ) %>%
  layout(showlegend = FALSE)
bp
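The play button and slider of a plot_ly() animation can be adjusted with animation_button() and animation_slider(). The sketch below is one way to do this; it assumes the gapminder data set instead of the GlobalPopulation workbook, and the slider prefix and button position are arbitrary choices.

```r
library(plotly)
library(gapminder)

bp_controls <- gapminder %>%
  plot_ly(x = ~gdpPercap, y = ~lifeExp,
          size = ~pop, color = ~continent,
          frame = ~year,      # one key frame per year
          ids   = ~country,   # keep each country's bubble consistent across frames
          text  = ~country, hoverinfo = "text",
          type = "scatter", mode = "markers") %>%
  layout(xaxis = list(type = "log"), showlegend = FALSE) %>%
  animation_slider(currentvalue = list(prefix = "Year: ")) %>%  # label shown above the slider
  animation_button(x = 1, xanchor = "right",
                   y = 0, yanchor = "bottom")                   # place the play button bottom-right

# Print bp_controls in an interactive session to view the animation
```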

References